This notebook walks through the process behind the final project presented at the Eurekathon final in 2021, which aimed to solve mobility challenges in the city of Matosinhos.

Our strategy aimed to increase micromobility in the city by 10% by identifying the areas with the most potential demand. To reinforce this approach, we used unsupervised machine learning to group those areas according to their specific characteristics.

With this, we were able to provide the city council with a tailor-made tool to expand micromobility infrastructure with a clear target.

1. DATA CLEANING AND EXPLORATION OF SHORT TRIPS

Driving times dataset

Mobility flows dataset

Merge mobility flows and trip distance datasets
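As a rough illustration of this merge step, the sketch below joins a flows table with a distance table on the origin and destination sections. The column names (`origin`, `destination`, `people`, `distance_km`) and all values are assumptions made for the example, not the real schema of the datasets.

```python
import pandas as pd

# Hypothetical toy data; the real datasets may use different column names.
flows = pd.DataFrame({
    "origin": ["A", "A", "B", "C"],
    "destination": ["B", "C", "C", "A"],
    "people": [120, 40, 75, 10],
})
distances = pd.DataFrame({
    "origin": ["A", "A", "B"],
    "destination": ["B", "C", "C"],
    "distance_km": [1.2, 4.8, 2.5],
})

# Left join keeps every flow; the indicator column flags flows without a distance.
merged = flows.merge(distances, on=["origin", "destination"], how="left", indicator=True)

# Flows with no matching distance are set aside rather than silently dropped.
missing = merged[merged["_merge"] == "left_only"]
trips = merged[merged["_merge"] == "both"].drop(columns="_merge")
```

Keeping the unmatched rows in `missing` makes it easy to check how many flows are lost in the join before continuing.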

Calculate the distance that most micromobility trips cover. We will call trips within this distance 'short trips' and check what share of the dataset they represent

75% of all micromobility trips are shorter than 3.25 km. We will filter our trips dataset and keep only those trips, which we consider short and suitable for micromobility

Short trips account for 90% of our trips dataset. That's quite a statement
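The threshold and the share of short trips can be sketched as follows. The distances and counts are made-up values, and the column names are assumptions; the sketch also treats a trip as short when its distance is at or below the 75th percentile, which may differ slightly from the cut-off used in the project.

```python
import pandas as pd

# Hypothetical micromobility trip distances in km (made-up values).
micro_km = pd.Series([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 6.0])

# Distance below which 75% of micromobility trips fall.
threshold = micro_km.quantile(0.75)

# Hypothetical trips table with one row per route.
trips = pd.DataFrame({
    "distance_km": [0.8, 2.9, 3.1, 5.0, 1.4],
    "people": [100, 50, 30, 20, 200],
})

# Keep only the routes short enough for micromobility.
short_trips = trips[trips["distance_km"] <= threshold]

# Share of short trips, by number of routes and by number of people.
share_routes = len(short_trips) / len(trips)
share_people = short_trips["people"].sum() / trips["people"].sum()
```

Reporting the share both by routes and by people is a design choice: the two can differ a lot when a few routes carry most of the travellers.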

Identify number of people starting a trip per statistical section

Identify number of people ending a trip per statistical section
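A minimal sketch of both counts, assuming a short-trips table with hypothetical `origin`, `destination` and `people` columns:

```python
import pandas as pd

# Hypothetical short-trips table (made-up values).
short_trips = pd.DataFrame({
    "origin": ["A", "A", "B", "C"],
    "destination": ["B", "C", "C", "A"],
    "people": [120, 40, 75, 10],
})

# People starting a trip in each section.
starts = short_trips.groupby("origin")["people"].sum().rename("people_starting")

# People ending a trip in each section.
ends = short_trips.groupby("destination")["people"].sum().rename("people_ending")

# One row per section; sections that appear on one side only get 0 on the other.
sections = pd.concat([starts, ends], axis=1).fillna(0).astype(int)
sections.index.name = "section"
```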

Rank short trip routes according to number of people

Top routes carrying 20% of the people (trips within the same section are excluded because we have no distance data for them)
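One way to read "top routes carrying 20% of the people" is as the smallest set of busiest routes whose cumulative share reaches 20%. The sketch below follows that reading; the data, the column names and the `target_share` variable are assumptions for illustration.

```python
import pandas as pd

# Hypothetical routes table (made-up values); the last row is a same-section trip.
routes = pd.DataFrame({
    "origin":      ["A", "A", "B", "B", "C", "C", "D", "D", "A"],
    "destination": ["B", "C", "A", "C", "A", "D", "A", "B", "A"],
    "people":      [15, 14, 13, 12, 12, 12, 11, 11, 500],
})
target_share = 0.20

# Exclude trips that start and end in the same section.
routes = routes[routes["origin"] != routes["destination"]]

# Rank routes by number of people and accumulate their share of the total.
ranked = routes.sort_values("people", ascending=False).reset_index(drop=True)
ranked["cum_share"] = ranked["people"].cumsum() / ranked["people"].sum()

# Keep routes up to and including the one that crosses the target share.
n_top = int((ranked["cum_share"] < target_share).sum()) + 1
top_routes = ranked.head(n_top)
```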

Plotting some insights

2. DATA CLEANING AND EXPLORATION OF CENSUS DATA TO ENRICH OUR RESEARCH

3. OBTAINING OUR FINAL TABLE

Combine final table

Identification of the busiest statistical sections according to the variables taken into account for our final data

4. WHAT WOULD BE THE IMPACT OF INCREASING MICROMOBILITY TRIPS BY 10%? METRICS CALCULATION

Emissions saved
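A back-of-the-envelope version of this metric could look like the sketch below. It is not the formula used in the project: the function name and all parameters are hypothetical, and it assumes that every shifted trip replaces a car trip of the same length and that micromobility itself emits nothing.

```python
def emissions_saved_kg(people_trips, mean_distance_km, shift_share, car_g_co2_per_km):
    """Estimate CO2 avoided (in kg) when a share of trips moves from car to micromobility.

    All inputs are assumptions supplied by the caller; no emission factor is built in.
    """
    # Number of trips that switch mode.
    shifted_trips = people_trips * shift_share
    # Grams of CO2 avoided, converted to kilograms.
    return shifted_trips * mean_distance_km * car_g_co2_per_km / 1000.0


# Placeholder inputs: 10,000 trips of 2 km, a 10% shift, and an illustrative
# (not measured) car emission factor of 120 g CO2 per km.
saved = emissions_saved_kg(10_000, 2.0, 0.10, 120.0)
```

With these placeholder inputs, 1,000 trips of 2 km at 120 g per km come to 240 kg of CO2. The real figure depends entirely on the trip counts, distances and emission factor plugged in.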

Extracting key figures about all trips and their proportion of short trips

5. CLUSTER MODELING TO UNDERSTAND SIMILARITIES BETWEEN AREAS IN THE CITY

Importing relevant libraries

Running unsupervised learning on our DataFrame

Choosing k: inertia and silhouette methods
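The sketch below shows how inertia and silhouette scores can be compared across candidate values of k. It assumes k-means from scikit-learn (the word "inertia" suggests it, but the original code cell is not shown here) and uses synthetic data in place of our final table.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the final table: three well-separated groups, two features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Scale features so no single variable dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

inertias, silhouettes = {}, {}
for k in range(2, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_  # within-cluster sum of squares (elbow plot)
    silhouettes[k] = silhouette_score(X_scaled, model.labels_)  # higher is better

# k with the highest silhouette score on this toy data.
best_k = max(silhouettes, key=silhouettes.get)
```

Inertia always tends to fall as k grows, so it is read by looking for an elbow, while the silhouette score gives a second opinion that can peak at a specific k.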

k = 4 or 5 looks decent from the above plot

k = 5 looks like it would be the best option!

Assigning Clusters

Concatenating this with the original DataFrame
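Assigning the labels and attaching them to the table can be sketched like this. The table, its column names and the use of k-means are assumptions, and the toy example uses 2 clusters only because it has 6 rows.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical final table with one row per statistical section (made-up values).
final = pd.DataFrame({
    "section": ["S1", "S2", "S3", "S4", "S5", "S6"],
    "people_starting": [10, 12, 11, 200, 210, 205],
    "population": [100, 110, 105, 900, 950, 920],
})
feature_cols = ["people_starting", "population"]

# Fit on scaled features and get one cluster label per row.
X_scaled = StandardScaler().fit_transform(final[feature_cols])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

# Reuse the original index so the labels line up row by row when concatenating.
clusters = pd.Series(labels, index=final.index, name="cluster")
clustered = pd.concat([final, clusters], axis=1)
```

Building the labels as a Series with the same index as the table avoids misaligned rows if the table was filtered or reordered earlier.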

Analyzing each cluster in detail

Running descriptive statistics on each cluster
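A compact way to profile the clusters is to group by the label and aggregate each variable. The table below is a made-up example with hypothetical column names.

```python
import pandas as pd

# Hypothetical clustered table (made-up values).
clustered = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 1],
    "people_starting": [10, 14, 200, 210, 190],
    "population": [100, 120, 900, 950, 850],
})

# Number of sections in each cluster.
sizes = clustered["cluster"].value_counts().sort_index()

# Summary statistics per cluster for every numeric variable.
profile = clustered.groupby("cluster").agg(["mean", "median", "min", "max"])
```

`profile` has one row per cluster and a two-level column index (variable, statistic), so a single value can be read with, for example, `profile.loc[0, ("people_starting", "mean")]`.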

Map of final clusters